1992-03-25
MUSH uses a simple bigram encoding scheme to compress the database in
memory. Supplied with the MUSH distribution, in bigram.h.dist, is the
default token table, based on the frequency of the top 128 bigrams in
the TinyMUD database.
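As a rough illustration of the idea (not the MUSH source), the sketch
below assumes the text is 7-bit ASCII, so the 128 byte values from 128
to 255 are free to stand for table entries. The function names and the
four-entry table are made up for the example; a real table would hold
128 bigrams.

```python
# Sketch of bigram (letter-pair) compression. Assumption: input is 7-bit
# ASCII, so byte values 128..255 can each stand for one table entry.

def build_lookup(table):
    """Map each bigram in the table (at most 128 entries) to a code byte."""
    if len(table) > 128:
        raise ValueError("only 128 spare byte values are available")
    return {pair: 128 + i for i, pair in enumerate(table)}

def compress(text, table):
    """Greedily replace table bigrams with single code bytes."""
    lookup = build_lookup(table)
    out = bytearray()
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in lookup:
            out.append(lookup[pair])      # one byte replaces two characters
            i += 2
        else:
            code = ord(text[i])
            if code >= 128:
                raise ValueError("input must be 7-bit ASCII")
            out.append(code)              # plain character, copied as-is
            i += 1
    return bytes(out)

def decompress(data, table):
    """Expand code bytes back into their bigrams."""
    return "".join(table[b - 128] if b >= 128 else chr(b) for b in data)

# Hypothetical four-entry table, for illustration only.
table = ["e ", "th", "in", " t"]
packed = compress("the thing in the tin", table)
print(len("the thing in the tin"), "characters ->", len(packed), "bytes")
```

The better the table matches the text actually stored, the more pairs
collapse to one byte, which is why a table built from your own database
can beat the default one.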
Unfortunately, the composition of the TinyMUD database is quite different
from the composition of a MUSH database. It is possible to get better
compression by generating a customized table for your MUSH.
Make "bigrams", then run it on your (uncompressed) database and pipe
its output through "sort -n -r". Redirect the sorted output to a file
and use "head" to look through it.
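For readers who want to see what such a frequency count amounts to,
here is a small sketch. It is not the "bigrams" program itself: the
counting rule (adjacent character pairs within a line) and the function
names are assumptions made for the example, and the real program's
output format may differ.

```python
from collections import Counter

def count_bigrams(text):
    """Count every pair of adjacent characters within each line."""
    counts = Counter()
    for line in text.splitlines():
        for i in range(len(line) - 1):
            counts[line[i:i + 2]] += 1
    return counts

def ranked(counts):
    """Most frequent first, similar to a reverse numeric sort."""
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))

# Tiny made-up sample standing in for a database dump.
sample = "the cat sat on the mat\nthe hat"
for pair, n in ranked(count_bigrams(sample))[:5]:
    print(n, repr(pair))
```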
Look at the output file and take the 128 most frequent sensible
bigrams from it. Not every entry will be sensible, because the program
scans through everything in the database, not just attributes (which
are the things we're interested in trying to save space on).
Unless your MUSH has some extremely common abbreviations, like "SFA"
for StarFleet Academy, that are used all the time in descriptions,
two-capital-letter combinations can generally be thrown out.
Combinations with "!" can also be thrown out; this is the program
getting fooled by the "!<number of object>" way of storing the
database.
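The two rules of thumb above could be applied mechanically, along the
lines of the sketch below. The function names and the sample counts are
invented for the example, and the filter is deliberately crude: if your
MUSH really does rely on a capitalized abbreviation, keep those pairs
by hand.

```python
def is_sensible(pair):
    """Reject the kinds of pairs that can usually be thrown out."""
    if "!" in pair:
        return False                      # artifact of "!<number>" storage
    if pair.isalpha() and pair.isupper():
        return False                      # two capital letters
    return True

def pick_table(ranked_pairs, size=128):
    """Keep the first `size` sensible bigrams from a ranked list."""
    kept = [pair for pair, _count in ranked_pairs if is_sensible(pair)]
    return kept[:size]

# Made-up ranked output, most frequent first.
ranked_pairs = [("e ", 900), ("!1", 850), ("th", 800), ("SF", 20), ("in", 15)]
print(pick_table(ranked_pairs, size=3))
```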
In general, you don't have to bother with the bigram table unless
memory is becoming a major consideration.